Categorical Database Generalization Aided by Data Model
Abstract
This paper addresses categorical database generalization and emphasizes the roles of the supporting data model, the integrated data model, spatial analysis and semantic analysis in database generalization. The framework of categorical database generalization transformation is defined. The paper presents an integrated spatial supporting data structure, a semantic supporting model and a similarity model for categorical database generalization. The concept of a transformation unit is proposed for generalization. The paper concludes with an application of categorical database generalization.

Keywords: Categorical Database Generalization, Data Model, Hierarchy, Semantic Evaluation Model, Transformation, Transformation Unit.

1. INTRODUCTION

The main objective of database generalization is to derive, for a particular application, a new database with a different (coarser) spatial, thematic or temporal resolution from one or more existing, more detailed databases. For a large database, efficient storage of and access to multi-scale, multiple-representation data, as well as complex generalization operators, need to be supported by a powerful data model and data structure. Although existing data models such as Delaunay triangulation networks (Delaunay 1934) and the Formal Data Structure (Molenaar 1989, 1991, 1995) have been applied to support automated generalization, development in this area (Peng 1997) is still at an early stage and requires much more effort.

Categorical database generalization relies on the exploitation of hierarchies that are inherent to spatial data (Molenaar 1996; Richardson 1994; Martinez Casanovas 1994, 1996; Smaalen 1996). Several authors have worked on categorical databases. Molenaar (1996) proposes four strategies for database generalization. Robin Fuller and Nigel Brown (1999) discuss automated generalization of the land cover map of Great Britain. Wang (2001) elaborates on the geometric aggregation of areas.
Yet these past studies emphasize only the geometric and visualization aspects. For example, an area smaller than 25 x 25 m is eliminated, and a gap of less than 5 pixels between two area features is bridged. Methods for the supporting data model, statistical analysis, semantic analysis and spatial analysis in categorical database generalization have received little research attention, although all four play a key role in generalization operations.

The remainder of the paper is organized as follows. The semantic supporting data model for database generalization transformation is defined first; the transformation model and the semantic similarity analysis model are then elaborated. Next, the concept of a transformation unit for the transformation process is proposed. Finally, examples are demonstrated.

2. SEMANTIC SUPPORTING MODEL FOR DATABASE GENERALIZATION

The contents of a categorical database are always closely related to a taxonomic system, for example a soil database to a soil taxonomic system and a landuse database to a landuse taxonomic system. The taxonomic system is used in the real world to establish hierarchies of classes that permit us to understand, as fully as possible, the relationships among entities, and between entities and the properties responsible for their character in the real world. Some concepts must be defined before discussing the semantic model.

2.1 Class, Object type and Object

In this paper, class and object type have the same meaning. A class or object type is defined by the attributes its members share; it determines a set of attributes that forms its attribute structure. An object is an instance of an object type or class. The attribute structure of objects is determined by the class to which they belong, so that each object has an attribute list containing one value for every attribute of its class.
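The relation between a class (object type) and its objects can be illustrated with a minimal Python sketch. The names `ObjectType` and `SpatialObject`, and the attributes `area_m2` and `irrigated`, are hypothetical and chosen only for illustration; the paper does not prescribe an implementation.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ObjectType:
    """A class (object type): a name plus the attributes its objects share."""
    name: str
    attributes: tuple  # attribute names forming the attribute structure


@dataclass
class SpatialObject:
    """An instance of an object type holding one value per class attribute."""
    oid: int
    object_type: ObjectType
    values: dict = field(default_factory=dict)

    def __post_init__(self):
        # Enforce "one value for every attribute of its class".
        expected = set(self.object_type.attributes)
        missing = expected - set(self.values)
        extra = set(self.values) - expected
        if missing or extra:
            raise ValueError(
                f"attribute mismatch: missing={sorted(missing)}, extra={sorted(extra)}"
            )


# Hypothetical landuse example; the attribute names are assumptions.
rice = ObjectType("rice", ("area_m2", "irrigated"))
parcel = SpatialObject(1, rice, {"area_m2": 1200.0, "irrigated": True})
```

Constructing an object whose values do not match the attribute structure of its class raises a `ValueError`, which is one simple way to express that the class determines the attribute structure of its objects.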
An object inherits the attribute structure of the class to which it belongs. The thematic description of an object can now be specified by its class (which specifies the attributes for the object) together with the list of attribute values.

2.2 Classification Hierarchy and Aggregation Hierarchy

Classification and aggregation hierarchies play a key role in defining the conceptual data model of a categorical database, since the object types in the conceptual data model are meaningful within a certain classification and aggregation hierarchy. Before a categorical database can be built, the classification structure must be chosen (Molenaar 1998) and the aggregation structure must be specified.

The classification hierarchy is used in the context of the database. It is expressed as an object type hierarchy that represents levels of object specificity. Furthermore, as an abstraction type, a classification hierarchy organizes levels of both objects and object type definitions and reflects the abstraction level of objects in the database. For a categorical database, which is always related to a taxonomic system in a certain application field, the classification hierarchy is derived from the taxonomic system. In this sense, the object types in the classification hierarchy correspond to the classes in the taxonomic system, the super object types and sub object types correspond to the super classes and sub classes respectively, and the objects of an object type correspond to the entities of the class. The taxonomic system can therefore be easily transformed into a classification hierarchy in the database (see Figure 1).

Figure 1. Classification hierarchy.

An aggregation hierarchy expresses how a higher-order object is composed of lower-order object types that belong to different classification hierarchies; in this sense, the aggregation hierarchy can be derived from the classification hierarchies for a certain application purpose.
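As a rough illustration of the two hierarchies, the sketch below stores a classification hierarchy as child-to-parent links and an aggregation hierarchy as a mapping from a higher-order object type to its parts. The object types rice, maize, river, road and transportation network come from the paper's examples; the intermediate types (`cropland`, `agriculture`, `orchard`, `landuse`) and all function names are assumptions made for the example.

```python
# Classification hierarchy: sub object type -> super object type.
# Only "rice" and "maize" are taken from the paper; the rest is hypothetical.
PARENT = {
    "rice": "cropland",
    "maize": "cropland",
    "cropland": "agriculture",
    "orchard": "agriculture",
    "agriculture": "landuse",
}

# Aggregation hierarchy: higher-order object type -> lower-order object types
# drawn from different classification hierarchies.
PART_OF = {"transportation network": ("river", "road")}


def ancestors(object_type):
    """Return the chain of super object types, nearest first."""
    chain = []
    while object_type in PARENT:
        object_type = PARENT[object_type]
        chain.append(object_type)
    return chain


def is_a(object_type, super_type):
    """True if object_type equals super_type or is one of its sub types."""
    return object_type == super_type or super_type in ancestors(object_type)


def parts_of(aggregate_type):
    """Return the lower-order object types that compose an aggregate type."""
    return PART_OF.get(aggregate_type, ())
```

Here `ancestors` walks up the levels of specificity, while `parts_of` answers a different question: which object types are combined to form a higher-order object.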
Even though we can specify the relations between a higher-order object type and lower-order object types to build an aggregation hierarchy, these relations are normally based on the classification hierarchies. Functionally, the classification hierarchy helps us find the objects we need, because it has sorted and categorized them in the categorical database. Once we have found them, the aggregation hierarchy tells us how to put them together meaningfully. For example, aggregating river and road object types into a transportation network yields a definition significantly different from the individual definitions of the river and road classifications (see Figure 2).

Figure 2. Aggregation hierarchy.

For a categorical database, the object types, the attribute structure of each object type and the relationships among object types in the conceptual data model are normally determined by the object types at the lowest level of the classification hierarchy or aggregation hierarchy. As shown in Figure 3.2, the object types rice, maize and so on at level 1 of the classification hierarchy constitute the object types of the conceptual data model in a landuse database.

3. TRANSFORMATION MODEL OF DATABASE

Database generalization can be considered as the transformation of the content of a spatial database from a high-resolution to a lower-resolution terrain representation (Molenaar 1996). In fact, database generalization is a transformation from an existing state of a database at a certain level of detail to a new state at a lower level of detail, on the basis of the application and user requirements (see Figure 3).

Figure 3. Transformation Model.
The state of a database (SDB) can be specified by a five-tuple:

$SDB = \{M, O, R, C, P\}$

where:
$SDB$ is the state of the database at a certain level of detail;
$M$ is the set of conceptual data models, $M = \{m_i\}$;
$O$ is the set of objects, $O = \{o_1, o_2, o_3, \ldots, o_i\}$;
$R$ is the set of relationships among the objects, $R = \{r \mid r \in O \times O\}$;
$C$ is a set of conditions or constraints for the transformation;
$P$ is a set of operators.

In categorical database transformation, several aspects must be taken into account.

3.1 Conceptual data model transformation

A conceptual data model is an abstraction of the real world for a field of interest. It consists of object types and the relationships among them in the context of the database. It plays an important role in database transformation: it determines which object types, and which instances of these object types, should be contained in the database, and thereby the level of detail and the contents of the target database. A database is an instance of a conceptual data model. In a sense, database generalization is the transformation from the data model of an existing database to the data model of a generalized database, based on the application purpose and requirements. This means that if the user introduces a new conceptual data model, it will lead to a database transformation from high resolution to lower resolution.

For a categorical database, the conceptual data model is closely related to the classification and aggregation hierarchies and to the taxonomic system of the application field. The classification and aggregation hierarchies play an important role in linking the definitions of spatial objects, and of spatial object types, at several scale levels (Molenaar 1996; Peng 1997; Peng and Tempfli 1997; Richardson 1993; Smaalen 1996). These hierarchies play an essential role in defining the conceptual data model of the categorical database.
Before an existing database is transformed into a new database, a new data model associated with the new database must be defined.
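The five-tuple state and a conceptual data model transformation can be sketched in Python as follows. This is only one possible reading, under the assumption that the new data model contains super object types of the existing ones, so that each object is reclassified by climbing the classification hierarchy. The names `DatabaseState`, `lift` and `transform`, and the intermediate types in `PARENT`, are hypothetical; geometric operations such as merging adjacent objects are deliberately left out.

```python
from dataclasses import dataclass, field


@dataclass
class DatabaseState:
    """SDB = {M, O, R, C, P} at a certain level of detail."""
    model: frozenset                                  # M: object types of the data model
    objects: dict                                     # O: object id -> object type name
    relations: set = field(default_factory=set)       # R: pairs of object ids
    constraints: list = field(default_factory=list)   # C: transformation constraints
    operators: list = field(default_factory=list)     # P: generalization operators


# Hypothetical classification hierarchy: sub object type -> super object type.
PARENT = {"rice": "cropland", "maize": "cropland", "cropland": "agriculture"}


def lift(object_type, target_model):
    """Climb the hierarchy until an object type of the target model is reached."""
    current = object_type
    while current not in target_model:
        if current not in PARENT:
            raise ValueError(f"{object_type!r} has no super type in the target model")
        current = PARENT[current]
    return current


def transform(state, target_model):
    """Derive a new state whose objects are typed by the new data model."""
    target_model = frozenset(target_model)
    new_objects = {oid: lift(t, target_model) for oid, t in state.objects.items()}
    return DatabaseState(
        target_model,
        new_objects,
        set(state.relations),
        list(state.constraints),
        list(state.operators),
    )


source = DatabaseState(frozenset({"rice", "maize"}), {1: "rice", 2: "maize"}, {(1, 2)})
target = transform(source, {"cropland"})
```

In this sketch the new data model must be supplied before `transform` can run, which mirrors the requirement stated above; after the call, both objects are typed as `cropland` and their relationship is carried over unchanged.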
Publication date: 2003